Abstract. It has been known since the 1970s that the $N$-dimensional $\ell_1$-space
contains nearly Euclidean subspaces whose dimension is $\Omega(N)$.
However, proofs of existence of such subspaces were probabilistic, hence
non-constructive, which made the results not quite suitable for subsequently
discovered applications to high-dimensional nearest neighbor
search, error-correcting codes over the reals, compressive sensing and
other computational problems. In this paper we present a "low-tech"
scheme which, for any $\gamma > 0$, allows us to exhibit nearly Euclidean
$\Omega(N)$-dimensional subspaces of $\ell_1^N$ while using only $N^\gamma$ random bits.
Our results extend and complement (particularly) recent work by
Guruswami-Lee-Wigderson. Characteristic features of our approach include
(1) simplicity (we use only tensor products) and (2) yielding almost Euclidean
subspaces with arbitrarily small distortions.



1 Introduction 

It is a well-known fact that for any vector $x \in \mathbb{R}^N$, its $\ell_2$ and $\ell_1$ norms are
related by the (optimal) inequality $\|x\|_2 \le \|x\|_1 \le \sqrt{N}\,\|x\|_2$. However, classical
results in geometric functional analysis show that for a "substantial fraction" of
vectors, the relation between the 1-norm and the 2-norm can be made much tighter.
Specifically, [FLM77,Kas77,GG84] show that there exists a subspace $E \subset \mathbb{R}^N$ of
dimension $m = \alpha N$, and a scaling constant $S$ such that for all $x \in E$

$\frac{1}{D}\sqrt{N}\,\|x\|_2 \le S\,\|x\|_1 \le \sqrt{N}\,\|x\|_2$ (1)

where $\alpha \in (0,1)$ and $D = D(\alpha)$, called the distortion of $E$, are absolute (notably
dimension-free) constants. Over the last few years, such "almost-Euclidean"
subspaces of $\ell_1^N$ have found numerous applications, to high-dimensional nearest
neighbor search [Ind00], error-correcting codes over reals and compressive
sensing [KT07,GLR08,GLW08], vector quantization [LV06], oblivious dimensionality
reduction and $\varepsilon$-samples for high-dimensional half-spaces [KRS09], and to other
problems.
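
To make the two extremes of the inequality $\|x\|_2 \le \|x\|_1 \le \sqrt{N}\,\|x\|_2$ concrete, here is a minimal sketch in plain Python. The helper names and the two test vectors are our own illustrative choices, not taken from the paper: a vector concentrated on one coordinate attains the lower bound, while a perfectly spread vector attains the upper bound.

```python
import math

def l1_norm(x):
    """Sum of absolute values of the coordinates."""
    return sum(abs(t) for t in x)

def l2_norm(x):
    """Euclidean norm."""
    return math.sqrt(sum(t * t for t in x))

def l1_l2_ratio(x):
    """Return ||x||_1 / ||x||_2, which always lies in [1, sqrt(N)]."""
    return l1_norm(x) / l2_norm(x)

N = 16
spike = [1.0] + [0.0] * (N - 1)   # all mass on one coordinate
flat = [1.0] * N                  # mass spread evenly over all coordinates

ratio_spike = l1_l2_ratio(spike)  # lower extreme of the inequality
ratio_flat = l1_l2_ratio(flat)    # upper extreme, equal to sqrt(N)
```

An almost-Euclidean subspace in the sense of (1) is one on which this ratio stays within a constant factor $D$ of a single value for every nonzero vector simultaneously.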

For the above applications, it is convenient and sometimes crucial that the
subspace $E$ is defined in an explicit manner^1. However, the aforementioned
results do not provide much guidance in this regard, since they use the probabilistic
method. Specifically, either the vectors spanning $E$, or the vectors spanning the
space dual to $E$, are i.i.d. random variables from some distribution. As a result,
the constructions require $\Omega(N^2)$ independent random variables as a starting point.
Until recently, the largest explicitly constructible almost-Euclidean subspace of
$\ell_1^N$, due to Rudin [Rud60] (cf. [LLR94]), had only a dimension of $\Theta(\sqrt{N})$.

During the last few years, there has been a renewed interest in the problem
[AM06,Sza06,Ind07,LS07,GLR08,GLW08], with researchers using ideas gained
from the study of expanders, extractors and error-correcting codes to obtain
several explicit constructions. The work progressed on two fronts, focusing on (a)
fully explicit constructions of subspaces attempting to maximize the dimension
and minimize the distortion [Ind07,GLR08], as well as (b) constructions using
limited randomness, with dimension and distortion matching (at least qualitatively)
the existential dimension and distortion bounds [Ind00,AM06,LS07,GLW08].
The parameters of the constructions are depicted in Figure 1. Qualitatively,
they show that in the fully explicit case, one can achieve either arbitrarily low
distortion or arbitrarily high subspace dimension, but not (yet?) both. In the
low-randomness case, one can achieve arbitrarily high subspace dimension and
constant distortion while using randomness that is sub-linear in $N$; achieving
arbitrarily low distortion was possible as well, albeit at a price of (super)-linear
randomness.

Fig. 1. The best known results for constructing almost-Euclidean subspaces of $\ell_1^N$. The
parameters $\varepsilon, \eta, \gamma \in (0,1)$ are assumed to be constants, although we explicitly point
out when the dependence on them is subsumed by the big-Oh notation.



^1 For the purpose of this paper "explicit" means "the basis of $E$ can be generated
by a deterministic algorithm with running time polynomial in $N$." However, the
individual constructions can be even "more explicit" than that.



Our result In this paper we show that, using sub-linear randomness, one can
construct a subspace with arbitrarily small distortion while keeping its dimension
proportional to $N$. More precisely, we have:

Theorem 1 Let $\varepsilon, \gamma \in (0,1)$. Given $N \in \mathbb{N}$, assume that we have at our
disposal a sequence of random bits of length $\max\{N^\gamma, C(\varepsilon,\gamma)\}\,\log(N/(\varepsilon\gamma))$. Then,
in deterministic polynomial (in $N$) time, we can generate numbers $M > 0$,
$m \ge c(\varepsilon,\gamma)N$ and an $m$-dimensional subspace $E$ of $\ell_1^N$, for which we have

$\forall x \in E, \quad (1-\varepsilon)M\|x\|_2 \le \|x\|_1 \le (1+\varepsilon)M\|x\|_2$

with probability greater than 98%.

In a sense, this complements the result of [GLW08], optimizing the distortion
of the subspace at the expense of its dimension. Our approach also allows us to
retrieve, using a simpler and low-tech approach, the results of [GLW08] (see
the comments at the end of the Introduction).

Overview of techniques The ideas behind many of the prior constructions as
well as this work can be viewed as variants of the related developments in the
context of error-correcting codes. Specifically, the construction of [Ind07]
resembles the approach of amplifying minimum distance of a code using expanders
developed in [ABN+92], while the constructions of [GLR08,GLW08] were
inspired by low-density parity check codes. The reason for this state of affairs is
that a vector whose $\ell_1$ and $\ell_2$ norms are very different must be
"well-spread", i.e., a small subset of its coordinates cannot contain most of its $\ell_2$
mass (cf. [Ind07,GLR08]). This is akin to a property required from a good
error-correcting code, where the weight (a.k.a. the $\ell_1$ norm) of each codeword cannot
be concentrated on a small subset of its coordinates.

In this vein, our construction utilizes a tool frequently used for (linear)
error-correcting codes, namely the tensor product. Recall that, for two linear codes
$C_1 \subset \{0,1\}^{n_1}$ and $C_2 \subset \{0,1\}^{n_2}$, their tensor product is a code $C \subset \{0,1\}^{n_1 n_2}$
such that for any codeword $c \in C$ (viewed as an $n_1 \times n_2$ matrix), each column of
$c$ belongs to $C_1$ and each row of $c$ belongs to $C_2$. It is known that the dimension
of $C$ is a product of the dimensions of $C_1$ and $C_2$, and that the same holds
for the minimum distance. This enables constructing a code of "large"
block-length $N^k$ by starting from a code of "small" block-length $N$ and tensoring it $k$
times. Here, we roughly show that the tensor product of two subspaces yields a
subspace whose distortion is a product of the distortions of the subspaces. Thus,
we can randomly choose an initial small low-distortion subspace, and tensor it
with itself to yield the desired dimension.
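
The coding-theoretic fact recalled above can be checked by brute force on a toy example. The sketch below is only an illustration under our own choices: the two small binary codes and all helper names are hypothetical and do not come from the paper. It builds the tensor product from Kronecker products of generator rows and confirms that dimension and minimum distance both multiply.

```python
from itertools import product

def span_gf2(generators):
    """All GF(2) linear combinations of the generator rows, as a set of tuples."""
    n = len(generators[0])
    words = set()
    for coeffs in product([0, 1], repeat=len(generators)):
        w = [0] * n
        for c, g in zip(coeffs, generators):
            if c:
                w = [(a + b) % 2 for a, b in zip(w, g)]
        words.add(tuple(w))
    return words

def dim(code):
    """Dimension of a binary linear code given as the set of its 2^dim codewords."""
    return len(code).bit_length() - 1

def min_distance(code):
    """Minimum Hamming weight of a nonzero codeword (= minimum distance, by linearity)."""
    return min(sum(w) for w in code if any(w))

def kron(u, v):
    """Flattened n1 x n2 matrix with entries u[i] * v[j]."""
    return tuple(a * b for a in u for b in v)

G1 = [(1, 1, 0), (0, 1, 1)]   # even-weight code of length 3: dimension 2, distance 2
G2 = [(1, 1, 1)]              # repetition code of length 3: dimension 1, distance 3

C1, C2 = span_gf2(G1), span_gf2(G2)
C = span_gf2([kron(g1, g2) for g1 in G1 for g2 in G2])  # tensor product code
```

Here `dim(C)` equals `dim(C1) * dim(C2)` and `min_distance(C)` equals `min_distance(C1) * min_distance(C2)`; the analogous multiplicative behaviour for distortion is what the tensoring argument for subspaces relies on.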

However, tensoring alone does not seem sufficient to give a subspace with
distortion arbitrarily close to 1. This is because we can only analyze the distortion
of the product space for the case when the scaling factor $S$ in Equation 1 is
equal to 1 (technically, we only prove the left inequality, and rely on the general
relation between the $\ell_2$ and $\ell_1$ norms for the upper bound). For $S = 1$, however, the
best achievable distortion is strictly greater than 1, and tensoring can make it
only larger. To avoid this problem, instead of the $\ell_1^N$ norm we use the $\ell_1^{N/B}(\ell_2^B)$
norm, for a "small" value of $B$. The latter norm (say, denoted by $\|\cdot\|$) treats
the vector as a sequence of $N/B$ "blocks" of length $B$, and returns the sum of
the $\ell_2$ norms of the blocks. We show that there exist subspaces $E \subset \ell_1^{N/B}(\ell_2^B)$
such that for any $x \in E$ we have

$\frac{1}{D}\sqrt{N/B}\,\|x\|_2 \le \|x\| \le \sqrt{N/B}\,\|x\|_2$

for $D$ that is arbitrarily close to 1. Thus, we can construct almost-Euclidean
subspaces of $\ell_1(\ell_2)$ of desired dimensions using tensoring, and get rid of the
"inner" $\ell_2$ norm at the end of the process.

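The block norm described above is easy to compute directly. The following is a minimal sketch; the function name `block_norm` and the sample vector are our own hypothetical choices. It splits a vector into consecutive blocks of length $B$ and sums the Euclidean norms of the blocks; the cases $B = 1$ and $B = N$ recover the plain $\ell_1$ and $\ell_2$ norms.

```python
import math

def block_norm(x, B):
    """l_1(l_2) norm: split x into consecutive blocks of length B and
    return the sum of the Euclidean norms of the blocks."""
    if len(x) % B != 0:
        raise ValueError("len(x) must be a multiple of B")
    return sum(
        math.sqrt(sum(t * t for t in x[i:i + B]))
        for i in range(0, len(x), B)
    )

x = [3.0, 4.0, 0.0, 0.0, 5.0, 12.0]
as_l1 = block_norm(x, 1)        # B = 1 recovers the plain l_1 norm
as_blocks = block_norm(x, 2)    # three blocks with l_2 norms 5, 0 and 13
as_l2 = block_norm(x, len(x))   # B = N recovers the plain l_2 norm
```

For every $x$ the block norm satisfies $\|x\|_2 \le \|x\| \le \sqrt{N/B}\,\|x\|_2$, the right-hand side being the Cauchy-Schwarz inequality applied to the $N/B$ block norms.
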
We point out that if we do not insist on distortion arbitrarily close to 1,
the "blocks" are not needed and the argument simplifies substantially. In
particular, to retrieve the results of [GLW08], it is enough to combine the
scalar-valued version of Proposition 1 below with "off-the-shelf" random constructions
[Kas77,GG84] yielding (in the notation of Equation 1) a subspace $E$, for which
the parameter $\alpha$ is close to 1.

2 Tensoring subspaces of $L_1$

We start by defining some basic notions and notation used in this section. 

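Norms and distortion In this section we adopt the "continuous" notation for
vectors and norms. Specifically, consider a real Hilbert space $\mathcal{H}$ and a probability
measure $\mu$ over $[0,1]$. For $p \in [1,\infty]$ consider the space $L_p(\mathcal{H})$ of $\mathcal{H}$-valued
$p$-integrable functions $f$ endowed with the norm

$\|f\|_p = \|f\|_{L_p(\mathcal{H})} = \left( \int \|f(x)\|_{\mathcal{H}}^p \, d\mu(x) \right)^{1/p}$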

In what follows we will omit $\mu$ from the formulae since the measure will be
clear from the context (and largely irrelevant). As our main result concerns
finite dimensional spaces, it suffices to focus on the case where $\mu$ is simply the
normalized counting measure over the discrete set $\{0, 1/n, \ldots, (n-1)/n\}$ for some
fixed $n \in \mathbb{N}$ (although the statements hold in full generality). In this setting, the
functions $f$ from $L_p(\mathcal{H})$ are equivalent to $n$-dimensional vectors with coordinates
in $\mathcal{H}$.* The advantage of using the $L_p$ norms as opposed to the $\ell_p$ norms is that
the relation between the 1-norm and the 2-norm does not involve scaling factors
that depend on dimension, i.e., we have $\|f\|_2 \ge \|f\|_1$ for all $f \in L_2(\mathcal{H})$ (note
that, for the $L_p$ norms, the "trivial" inequality goes in the other direction than
for the $\ell_p$ norms). This simplifies the notation considerably.
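
The difference between the normalized $L_p$ norms and the unnormalized $\ell_p$ norms can be seen on a small example. This is a sketch for the scalar case only (that is, assuming $\mathcal{H} = \mathbb{R}$); the function names and the sample vector are hypothetical.

```python
import math

def Lp_norm(f, p):
    """L_p norm of f with respect to the normalized counting measure on len(f) points."""
    return (sum(abs(t) ** p for t in f) / len(f)) ** (1.0 / p)

def lp_norm(f, p):
    """Unnormalized l_p norm of the same vector."""
    return sum(abs(t) ** p for t in f) ** (1.0 / p)

f = [1.0, -2.0, 0.0, 3.0]
n = len(f)

L1, L2 = Lp_norm(f, 1), Lp_norm(f, 2)   # normalized: L2 >= L1
l1, l2 = lp_norm(f, 1), lp_norm(f, 2)   # unnormalized: l1 >= l2
```

The two ratios are related by $\|f\|_{L_1}/\|f\|_{L_2} = (\|f\|_{\ell_1}/\|f\|_{\ell_2})/\sqrt{n}$, so the dimension-dependent factor $\sqrt{n}$ is absorbed into the normalization.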

* The values from $\mathcal{H}$ roughly correspond to the finite-dimensional "blocks" in the
construction sketched in the introduction. Note that $\mathcal{H}$ can be discretized similarly
as the $L_p$-spaces; alternatively, functions that are constant on intervals of the type
$((k-1)/N, k/N]$ can be considered in lieu of discrete measures.



$\frac{1}{D}\sqrt{N/B}\,\|x\|_2 \le \|x\| \le \sqrt{N/B}\,\|x\|_2$




We will be interested in linear subspaces $E \subset L_2(\mathcal{H})$ on which the 1-norm
and 2-norm uniformly agree, i.e., for some $c \in (0,1]$,

$\|f\|_2 \ge \|f\|_1 \ge c\,\|f\|_2$ (2)

for all $f \in E$. The best (the largest) constant $c$ that works in (2) will be denoted
$\Lambda_1(E)$. For completeness, we also define $\Lambda_1(E) = 0$ if no $c > 0$ works.

Tensor products If $\mathcal{H}, \mathcal{K}$ are Hilbert spaces, $\mathcal{H} \otimes_2 \mathcal{K}$ is their (Hilbertian) tensor
product, which may be (for example) described by the following property: if $(e_j)$
is an orthonormal sequence in $\mathcal{H}$ and $(f_k)$ is an orthonormal sequence in $\mathcal{K}$,
then $(e_j \otimes f_k)$ is an orthonormal sequence in $\mathcal{H} \otimes_2 \mathcal{K}$ (a basis if $(e_j)$ and $(f_k)$
were bases). Next, any element of $L_2(\mathcal{H}) \otimes \mathcal{K}$ is canonically identified with a
function in the space $L_2(\mathcal{H} \otimes_2 \mathcal{K})$; note that such functions are $\mathcal{H} \otimes \mathcal{K}$-valued,
but are defined on the same probability space as their counterparts from $L_2(\mathcal{H})$.
If $E \subset L_2(\mathcal{H})$ is a linear subspace, $E \otimes \mathcal{K}$ is, under this identification, a linear
subspace of $L_2(\mathcal{H} \otimes_2 \mathcal{K})$.

As hinted in the Introduction, our argument depends (roughly) on the fact 
that the property expressed by (1) or (2) "passes" to tensor products of sub- 
spaces, and that it "survives" replacing scalar-valued functions by ones that 
have values in a Hilbert space. Statements to similar effect of various degrees 
of generality and precision are widely available in the mathematical literature, 
see for example [MZ39,Bec75,And80,FJ80]. However, we are not aware of a ref- 
erence that subsumes all the facts needed here and so we present an elementary 
self-contained proof. 

We start with two preliminary lemmas. 
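Lemma 1 If $g_1, g_2, \ldots \in E \subset L_2(\mathcal{H})$, then

$\int \Big( \sum_k \|g_k(x)\|_{\mathcal{H}}^2 \Big)^{1/2} dx \;\ge\; \Lambda_1(E) \Big( \int \sum_k \|g_k(x)\|_{\mathcal{H}}^2 \, dx \Big)^{1/2}$

Proof Let $\mathcal{K}$ be an auxiliary Hilbert space and $(e_k)$ an orthonormal sequence
(O.N.S.) in $\mathcal{K}$. We will apply the Minkowski inequality (a continuous version of
the triangle inequality, which says that for vector valued functions
$\|\int h\| \le \int \|h\|$) to the $\mathcal{K}$-valued function $h(x) = \sum_k \|g_k(x)\|_{\mathcal{H}}\, e_k$. As is easily seen,

$\Big\| \int h \Big\|_{\mathcal{K}} = \Big\| \sum_k \Big( \int \|g_k(x)\|_{\mathcal{H}} \, dx \Big) e_k \Big\|_{\mathcal{K}} = \Big( \sum_k \|g_k\|_{L_1(\mathcal{H})}^2 \Big)^{1/2}$

Given that $g_k \in E$, $\|g_k\|_{L_1(\mathcal{H})} \ge \Lambda_1(E)\, \|g_k\|_{L_2(\mathcal{H})}$ and so

$\Big\| \int h \Big\|_{\mathcal{K}} \;\ge\; \Lambda_1(E) \Big( \sum_k \|g_k\|_{L_2(\mathcal{H})}^2 \Big)^{1/2} = \Lambda_1(E) \Big( \int \sum_k \|g_k(x)\|_{\mathcal{H}}^2 \, dx \Big)^{1/2}$

On the other hand, the left hand side of the inequality in Lemma 1 is exactly
$\int \|h\|_{\mathcal{K}}$, so the Minkowski inequality yields the required estimate.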

We are now ready to state the next lemma. Recall that $E$ is a linear subspace
of $L_2(\mathcal{H})$, and $\mathcal{K}$ is a Hilbert space.



Lemma 2 $\Lambda_1(E \otimes \mathcal{K}) = \Lambda_1(E)$



If $E \subset L_2 = L_2(\mathbb{R})$, the lemma says that any estimate of type (2) for scalar
functions $f \in E$ carries over to their linear combinations with vector coefficients,
namely to functions of the type $\sum_j v_j f_j$, $f_j \in E$, $v_j \in \mathcal{K}$. In the general case,
any estimate for $\mathcal{H}$-valued functions $f \in E \subset L_2(\mathcal{H})$ carries over to functions of
the form $\sum_j f_j \otimes v_j \in L_2(\mathcal{H} \otimes_2 \mathcal{K})$, with $f_j \in E$, $v_j \in \mathcal{K}$.

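Proof of Lemma 2 Let $(e_k)$ be an orthonormal basis of $\mathcal{K}$. In fact w.l.o.g. we may
assume that $\mathcal{K} = \ell_2$ and that $(e_k)$ is the canonical orthonormal basis. Consider
$g = \sum_j f_j \otimes v_j$, where $f_j \in E$ and $v_j \in \mathcal{K}$. Then also $g = \sum_k g_k \otimes e_k$ for some
$g_k \in E$ and hence (pointwise) $\|g(x)\|_{\mathcal{H} \otimes_2 \mathcal{K}} = \big( \sum_k \|g_k(x)\|_{\mathcal{H}}^2 \big)^{1/2}$. Accordingly,
$\|g\|_{L_2(\mathcal{H} \otimes_2 \mathcal{K})} = \big( \int \sum_k \|g_k(x)\|_{\mathcal{H}}^2 \, dx \big)^{1/2}$, while
$\|g\|_{L_1(\mathcal{H} \otimes_2 \mathcal{K})} = \int \big( \sum_k \|g_k(x)\|_{\mathcal{H}}^2 \big)^{1/2} dx$.
Comparing such quantities is exactly the object of Lemma 1, which implies that
$\|g\|_{L_1(\mathcal{H} \otimes_2 \mathcal{K})} \ge \Lambda_1(E)\, \|g\|_{L_2(\mathcal{H} \otimes_2 \mathcal{K})}$. Since $g \in E \otimes \mathcal{K}$ was arbitrary, it follows that
$\Lambda_1(E \otimes \mathcal{K}) \ge \Lambda_1(E)$. The reverse inequality is automatic (except in the trivial
case $\dim \mathcal{K} = 0$, which we will ignore).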

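If $E \subset L_2(\mathcal{H})$ and $F \subset L_2(\mathcal{K})$ are subspaces, $E \otimes F$ is the subspace of
$L_2(\mathcal{H} \otimes_2 \mathcal{K})$ spanned by $f \otimes g$ with $f \in E$, $g \in F$. (For clarity, $f \otimes g$ is a
function on the product of the underlying probability spaces and is defined by
$(x,y) \mapsto f(x) \otimes g(y) \in \mathcal{H} \otimes \mathcal{K}$.)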

The next proposition shows the key property of tensoring almost-Euclidean 
spaces. 



Proposition 1. $\Lambda_1(E \otimes F) \ge \Lambda_1(E)\,\Lambda_1(F)$

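Proof Let $(\varphi_j)$ and $(\psi_k)$ be orthonormal bases of respectively $E$ and $F$ and let
$g = \sum_{j,k} t_{jk}\, \varphi_j \otimes \psi_k$. We need to show that
$\|g\|_{L_1(\mathcal{H} \otimes_2 \mathcal{K})} \ge \Lambda_1(E)\Lambda_1(F)\, \|g\|_{L_2(\mathcal{H} \otimes_2 \mathcal{K})}$,
where the $p$-norms refer to the product probability space, for example

$\|g\|_{L_1(\mathcal{H} \otimes_2 \mathcal{K})} = \int\!\!\int \Big\| \sum_{j,k} t_{jk}\, \varphi_j(x) \otimes \psi_k(y) \Big\|_{\mathcal{H} \otimes_2 \mathcal{K}} \, dx \, dy$

Rewriting the expression under the sum and subsequently applying Lemma 2 to
the inner integral for fixed $y$ gives

$\int \Big\| \sum_{j,k} t_{jk}\, \varphi_j(x) \otimes \psi_k(y) \Big\|_{\mathcal{H} \otimes_2 \mathcal{K}} dx = \int \Big\| \sum_j \varphi_j(x) \otimes \Big( \sum_k t_{jk} \psi_k(y) \Big) \Big\|_{\mathcal{H} \otimes_2 \mathcal{K}} dx$

$\quad \ge \Lambda_1(E) \Big( \int \Big\| \sum_j \varphi_j(x) \otimes \Big( \sum_k t_{jk} \psi_k(y) \Big) \Big\|_{\mathcal{H} \otimes_2 \mathcal{K}}^2 dx \Big)^{1/2}$

$\quad = \Lambda_1(E) \Big( \sum_j \Big\| \sum_k t_{jk} \psi_k(y) \Big\|_{\mathcal{K}}^2 \Big)^{1/2}$

In turn, $\sum_k t_{jk} \psi_k \in F$ (for all $j$) and so, by Lemma 1,

$\int \Big( \sum_j \Big\| \sum_k t_{jk} \psi_k(y) \Big\|_{\mathcal{K}}^2 \Big)^{1/2} dy \;\ge\; \Lambda_1(F) \Big( \int \sum_j \Big\| \sum_k t_{jk} \psi_k(y) \Big\|_{\mathcal{K}}^2 \, dy \Big)^{1/2} = \Lambda_1(F)\, \|g\|_{L_2(\mathcal{H} \otimes_2 \mathcal{K})}$

Combining the above formulae yields the conclusion of the Proposition.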
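
A sanity check of the multiplicative behaviour is possible in the simplest situation, which we assume here purely for illustration: scalar-valued functions and one-dimensional subspaces $E = \mathrm{span}\{f\}$, $F = \mathrm{span}\{g\}$. In that case the best constant in (2) is just $\|f\|_1/\|f\|_2$, and since $\|f \otimes g\|_p = \|f\|_p \|g\|_p$ on the product space, the inequality of Proposition 1 holds with equality. The function names and sample values below are hypothetical.

```python
import math

def L_norm(values, p):
    """L_p norm with respect to the normalized counting measure on len(values) points."""
    return (sum(abs(t) ** p for t in values) / len(values)) ** (1.0 / p)

def ratio(values):
    """||f||_1 / ||f||_2; for E = span{f} this is the best constant c in (2)."""
    return L_norm(values, 1) / L_norm(values, 2)

def tensor(f, g):
    """Values of (x, y) -> f(x) * g(y) on the product of the two point sets."""
    return [a * b for a in f for b in g]

f = [1.0, 2.0, 2.0]
g = [3.0, 0.0, 4.0, 1.0]

lam_E = ratio(f)               # constant for span{f}
lam_F = ratio(g)               # constant for span{g}
lam_EF = ratio(tensor(f, g))   # constant for span{f tensor g}
```

Because each factor is at most 1, the product `lam_E * lam_F` can only decrease under tensoring, which is why the seed subspace must already have its constant very close to 1.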

3 The construction 

In this section we describe our low-randomness construction. We start from a
recap of the probabilistic construction, since we use it as a building block.

3.1 Dvoretzky's theorem, and its "tangible" version 

For general normed spaces, the following is one possible statement of the well- 
known Dvoretzky's theorem [Dvo61]: 

Given $m \in \mathbb{N}$ and $\varepsilon > 0$ there is $N = N(m,\varepsilon)$ such that, for any norm on
$\mathbb{R}^N$ there is an $m$-dimensional subspace on which the ratio of that norm and the
$\ell_2$ norm is (approximately) constant, up to a multiplicative factor $1+\varepsilon$.

For specific norms this statement can be made more precise, both in describing 
the dependence N = N{m, e) and in identifying the constant of (approximate) 
proportionality of norms. The following version is (essentially) due to Milman 
[Mil71]. 

Dvoretzky's theorem (Tangible version) Consider the $N$-dimensional Euclidean
space (real or complex) endowed with the Euclidean norm $\|\cdot\|_2$ and some
other norm $\|\cdot\|$ such that, for some $b > 0$, $\|\cdot\| \le b\,\|\cdot\|_2$. Let $M = \mathbb{E}\|X\|$, where $X$
is a random variable uniformly distributed on the unit Euclidean sphere. Then
there exists a computable universal constant $c > 0$, so that if $0 < \varepsilon < 1$ and
$m \le c\,\varepsilon^2 (M/b)^2 N$, then for more than 99% (with respect to the Haar measure) of the
$m$-dimensional subspaces $E$ we have

$\forall x \in E, \quad (1-\varepsilon)M\|x\|_2 \le \|x\| \le (1+\varepsilon)M\|x\|_2$. (3)

Alternative good expositions of the theorem are in, e.g., [FLM77], [MS86] and
[Pis89]. We point out that standard and most elementary proofs yield
$m \le c\,\varepsilon^2/\log(1/\varepsilon)\,(M/b)^2 N$; the dependence on $\varepsilon$ of order $\varepsilon^2$ was obtained in the
important papers [Gor85,Sch89], see also [ASW10].

3.2 The case of $\ell_1^n(\ell_2^B)$

Our objective now is to apply Dvoretzky's theorem and subsequently Proposition
1 to spaces of the form $\ell_1^n(\ell_2^B)$ for some $n, B \in \mathbb{N}$, so from now on we set
$\|\cdot\| := \|\cdot\|_{\ell_1^n(\ell_2^B)}$. To that end, we need to determine the values of the parameter
$M$ that appears in the theorem. (The optimal value of $b$ is clearly $\sqrt{n}$, as in
the scalar case, i.e., when $B = 1$.) We have the following standard (cf. [Bal97],
Lecture 9)

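Lemma 3

$M(n,B) := \mathbb{E}_{x \in S^{nB-1}} \|x\| = n \, \frac{\Gamma(\frac{B+1}{2})}{\Gamma(\frac{B}{2})} \cdot \frac{\Gamma(\frac{nB}{2})}{\Gamma(\frac{nB+1}{2})}$

In particular, $\sqrt{1 + \frac{1}{n-1}}\,\sqrt{\frac{2n}{\pi}} > M(n,1) > \sqrt{\frac{2n}{\pi}}$ for all $n \in \mathbb{N}$ (the scalar
case) and $M(n,B) > \sqrt{1 - \frac{1}{B}}\,\sqrt{n}$ for all $n, B \in \mathbb{N}$.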

The equality is shown by relating (via passing to polar coordinates) spherical
averages of norms to Gaussian means: if $X$ is a random variable uniformly
distributed on the Euclidean sphere $S^{N-1}$ and $Y$ has the standard Gaussian
distribution on $\mathbb{R}^N$, then, for any norm $\|\cdot\|$,

$\mathbb{E}\|Y\| = \frac{\sqrt{2}\,\Gamma(\frac{N+1}{2})}{\Gamma(\frac{N}{2})} \, \mathbb{E}\|X\|$

The inequalities follow from the estimates $\sqrt{x - \frac{1}{2}} < \frac{\Gamma(x + \frac{1}{2})}{\Gamma(x)} < \sqrt{x}$ (for $x \ge \frac{1}{2}$),
which in turn are consequences of log-convexity of $\Gamma$ and its functional equation
$\Gamma(y+1) = y\,\Gamma(y)$. (Alternatively, Stirling's formula may be used to arrive at a
similar conclusion.)
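
The bounds of Lemma 3 can be checked numerically. The sketch below assumes the closed form $M(n,B) = n\,\Gamma(\frac{B+1}{2})\Gamma(\frac{nB}{2}) / (\Gamma(\frac{B}{2})\Gamma(\frac{nB+1}{2}))$ obtained from the Gaussian-to-spherical conversion above, and evaluates it through log-Gamma to avoid overflow; the function name `M` and the sample parameters are our own choices.

```python
import math

def M(n, B):
    """Spherical mean of the l_1^n(l_2^B) norm on the unit sphere of R^(nB),
    evaluated from the Gamma-function formula via log-Gamma for stability."""
    log_m = (math.log(n)
             + math.lgamma((B + 1) / 2) - math.lgamma(B / 2)
             + math.lgamma(n * B / 2) - math.lgamma((n * B + 1) / 2))
    return math.exp(log_m)

n = 50

# Scalar case B = 1: M(n, 1) is sandwiched around sqrt(2n/pi).
scalar = M(n, 1)
lower_scalar = math.sqrt(2 * n / math.pi)
upper_scalar = math.sqrt(1 + 1 / (n - 1)) * lower_scalar

# Block case: M(n, B) / sqrt(n) approaches 1 as the block length B grows.
blocks = {B: M(n, B) / math.sqrt(n) for B in (2, 10, 100)}
```

The dictionary `blocks` illustrates why blocks help: already for moderate $B$ the ratio $M/b = M(n,B)/\sqrt{n}$ exceeds $\sqrt{1 - 1/B}$, so the lower and upper estimates on $\|x\|$ nearly coincide.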

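Combining Dvoretzky's theorem with Lemma 3 yields

Corollary 1 If $0 < \varepsilon < 1$ and $m \le c_1 \varepsilon^2 n$, then for more than 99% of the
$m$-dimensional subspaces $E \subset \ell_1^n$ we have

$\forall x \in E \quad (1-\varepsilon)\sqrt{\frac{2}{\pi}}\,\sqrt{n}\,\|x\|_2 \le \|x\|_1 \le (1+\varepsilon)\sqrt{1 + \frac{1}{n-1}}\,\sqrt{\frac{2}{\pi}}\,\sqrt{n}\,\|x\|_2$ (4)

Similarly, if $B > 1$ and $m \le c_2 \varepsilon^2 nB$, then for more than 99% of the
$m$-dimensional subspaces $E \subset \ell_1^n(\ell_2^B)$ we have

$\forall x \in E \quad (1-\varepsilon)\sqrt{1 - \frac{1}{B}}\,\sqrt{n}\,\|x\|_2 \le \|x\| \le \sqrt{n}\,\|x\|_2$ (5)

We point out that the upper estimate on $\|x\|$ in the second inequality is valid
for all $x \in \ell_1^n(\ell_2^B)$ and, like the estimate $M(n,B) \le \sqrt{n}$, follows just from the
Cauchy-Schwarz inequality.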

Since a random subspace chosen uniformly according to the Haar measure
on the manifold of $m$-dimensional subspaces of $\mathbb{R}^N$ (or $\mathbb{C}^N$) can be constructed
from an $N \times m$ random Gaussian matrix, we may apply standard discretization
techniques to obtain the following



Corollary 2 There is a deterministic algorithm that, given $\varepsilon, B, m, n$ as in
Corollary 1 and a sequence of $O(mn \log(mn/\varepsilon))$ random bits, generates
subspaces $E$ as in Corollary 1 with probability greater than 98%, in time polynomial
in $1/\varepsilon + B + m + n$.

We point out that in the literature on "randomness reduction", one typically
uses Bernoulli matrices in lieu of Gaussian ones. This enables avoiding the
discretization issue, since the problem is phrased directly in terms of random
bits. Still, since proofs of Dvoretzky type theorems for Bernoulli matrices are
often much harder than for their Gaussian counterparts, we prefer to appeal
instead to a simple discretization of Gaussian random variables. We note, however,
that the early approach of [Kas77] was based on Bernoulli matrices.

We are now ready to conclude the proof of Theorem 1. Given $\varepsilon \in (0,1)$
and $n \in \mathbb{N}$, choose $B = \lceil \varepsilon^{-1} \rceil$ and $m = \lfloor c\,\varepsilon^2 (1 - \frac{1}{B})\, nB \rfloor \ge c_0 \varepsilon^2 nB$. Corollary
2 (Equation 5) and repeated application of Proposition 1 (with $k$ tensor factors
in total) give us a subspace $F \subset \ell_1^\nu(\ell_2^\beta)$ (where $\nu = n^k$ and $\beta = B^k$) of
dimension $m^k \ge (c_0 \varepsilon^2)^k \nu\beta$ such that

$\forall x \in F \quad (1-\varepsilon)^{3k/2}\, n^{k/2}\, \|x\|_2 \le \|x\| \le n^{k/2}\, \|x\|_2$.

Moreover, $F = E \otimes E \otimes \ldots \otimes E$, where $E \subset \ell_1^n(\ell_2^B)$ is a typical $m$-dimensional
subspace. Thus in order to produce $E$, hence $F$, we only need to generate a "typical"
subspace of dimension $m \approx c_0 \varepsilon^2 (\nu\beta)^{1/k}$ in the $nB = (\nu\beta)^{1/k}$-dimensional space $\ell_1^n(\ell_2^B)$.
Note that for fixed $\varepsilon$ and $k > 1$, $nB$ and $m$ are asymptotically (substantially)
smaller than $\dim F$. Further, in order to efficiently represent $F$ as a subspace of
an $\ell_1$-space, we only need to find a good embedding of $\ell_2^\beta$ into $\ell_1$. This can be
done using Corollary 2 (Equation 4); note that $\beta$ depends only on $\varepsilon$ and $k$. Thus
we reduced the problem of finding "large" almost Euclidean subspaces of $\ell_1^N$ to
similar problems for much smaller dimensions.
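
The bookkeeping in this reduction can be sketched as follows. This is only an illustration under stated assumptions: the universal constant is not specified in the text, so `c0` below is a hypothetical placeholder value, the seed dimension uses the lower bound $c_0 \varepsilon^2 nB$ rather than the exact floor expression, and the function name and sample parameters are ours.

```python
import math

def construction_parameters(eps, n, k, c0):
    """Parameters of the k-fold tensor power of a seed subspace of l_1^n(l_2^B).

    c0 stands in for the unspecified universal constant (hypothetical value).
    """
    B = math.ceil(1 / eps)                     # block length of the seed space
    seed_dim = n * B                           # ambient dimension of the seed space
    m = math.floor(c0 * eps ** 2 * seed_dim)   # seed subspace dimension (lower bound)
    return {
        "B": B,
        "nu": n ** k,                          # number of blocks after tensoring
        "beta": B ** k,                        # block length after tensoring
        "seed_dim": seed_dim,
        "m": m,
        "dim_F": m ** k,                       # dimension of F = E x ... x E (k factors)
        "ambient_dim_F": seed_dim ** k,        # equals nu * beta
        "lower_factor": (1 - eps) ** (1.5 * k),  # (1 - eps)^(3k/2) from the estimate on F
    }

params = construction_parameters(eps=0.25, n=1000, k=3, c0=0.5)
```

With these sample values the random seed subspace lives in dimension 4000, while the tensor power $F$ has dimension $125^3$; this gap between `seed_dim` and `dim_F` is what makes the randomness sub-linear. To keep `lower_factor` close to 1, $\varepsilon$ has to shrink as $k$ grows.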

Theorem 1 now follows from the above discussion. The argument gives, e.g., 
c(e,7) = {ce^)^^'' and C{£,^) = c(e,7)~-^. 